ABSTRACT

We identify low redshift clusters and groups in the Sloan Digital Sky Survey (SDSS) and estimate
their kinetic and correlation potential energies. We compare the distribution of these energies to
the predictions of Yang & Saslaw (2012) and in the process estimate a measure of the average
3-dimensional velocity and spatial anisotropy of a sample of clusters. We find that the inferred velocity
anisotropy is correlated with the inferred spatial anisotropy. We also find that the general shape of the
energy distribution agrees with theory over a wide range of scales from small groups to superclusters
once the uncertainties and fluctuations in the estimated energies are included.

Subject headings: cosmology: theory — galaxies: clusters: general — gravitation — large-scale struc- 
ture of universe — methods: analytical — methods: statistical 



1. INTRODUCTION 

Clusters and groups of galaxies are structures in the 
universe which have been defined using various criteria; 
groups are smaller clusters. These clusters may contain 
over a thousand member galaxies and occupy a few cubic 
megaparsecs. Although clusters can be identified by their
density and size (e.g. Abell 1958; Herzog et al. 1957),
their shapes and structures differ. For example, Herzog
et al. (1957) identify three classes of clusters: compact
clusters which have a single nearly spherical dense con- 
centration of galaxies, medium compact clusters which 
are less dense and may have multiple concentrations of 
galaxies and loose clusters which do not have any out- 
standing concentrations of galaxies. 

Even these simple classes of clusters suggest greater
complexity than just spherical concentrations in a region
of space. For example, Binggeli et al. (1987) found
significant substructure in the core of the Virgo cluster as
well as pronounced double structure, and Herzog et al.
(1957) classify the Virgo cluster as a medium compact
cluster. The structure of the Virgo cluster, as the nearest
large cluster, suggests that these irregular shapes are
common.

Such irregular shapes may result from mergers. 
Smaller groups fall into the central region of a cluster and 
form subgroups whose member galaxies are still tightly 
bound to each other. Irregular shapes resulting from sub- 
groups then disappear as a cluster virializes. However, 
many clusters have dynamical relaxation timescales on 
the order of a Hubble time, and their incomplete viri- 
alization suggests that irregular clusters with multiple 
concentrations of galaxies should be common in the uni- 
verse. 

The basic dynamical description of a cluster is its 6- 
dimensional phase space configuration such as a sphere 
with a density and velocity profile. More detailed
descriptions of clustering include correlation functions,
percolation trees and counts-in-cells statistics. In particular,
the counts-in-cells description is especially suitable
for this problem because it straightforwardly analyzes
regions of space (cells) with a specified size and shape. In
addition, the physics of this description can be derived
from gravitational thermodynamics (Saslaw & Hamilton
1984) or statistical mechanics (Ahmad et al. 2002) where
the galaxies are in quasi-equilibrium and interact in a 
grand canonical ensemble of cells. 

While self-gravitating systems, and thus cells, are not 
in strict equilibrium, they are in quasi-equilibrium when 
the average energies and thermodynamic quantities of an 
ensemble of cells change slowly compared to the dynam- 
ical timescale of a single cell. This means that the inter- 
mediate time averages of the ensemble are stable, while 
the local snapshot energies of a cell fluctuate about their 
quasi-equilibrium time averages. The quasi-equilibrium 
approximation thus allows us to study self-gravitating
systems using thermodynamics originally intended for 
systems in equilibrium. 

The resulting counts-in-cells distribution is
known as the gravitational quasi-equilibrium distribution
(GQED). The simplest form of the counts-in-cells
GQED is (Saslaw 2000)

    f_V(N) = \frac{\bar{N}(1-b)}{N!} \left[ \bar{N}(1-b) + Nb \right]^{N-1} e^{-\bar{N}(1-b) - Nb}    (1)

which describes the probability that a cell of volume V
has N galaxies. This depends on the average number of
galaxies in a cell N̄ and a clustering parameter b which is
related to the mean and variance of f_V(N) through the
dimensionless equation (Ahmad et al. 2002)

    \frac{\langle (\Delta N)^2 \rangle}{\bar{N}} = \frac{1}{(1-b)^2}.    (2)

The GQED therefore describes the clustering of galaxies
with no free parameters.
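
As a numerical illustration of the GQED, the following minimal sketch evaluates the standard form f_V(N) = N̄(1−b)/N! [N̄(1−b) + Nb]^(N−1) exp[−N̄(1−b) − Nb] and checks that it sums to unity with mean N̄ and variance N̄/(1−b)². The function names and the cutoff n_max are illustrative choices, not part of the paper, and the values of N̄ and b are only of the order of those found for the 1a(z) sample at R = 5 h^-1 Mpc.

```python
from math import exp, lgamma, log

def gqed(n, nbar, b):
    """Probability f_V(N) that a cell holds n galaxies (0 <= b < 1, nbar > 0)."""
    # Work in log space so that N! and the (N-1)th power do not overflow.
    log_f = (log(nbar * (1.0 - b)) - lgamma(n + 1)
             + (n - 1) * log(nbar * (1.0 - b) + n * b)
             - nbar * (1.0 - b) - n * b)
    return exp(log_f)

def moments(nbar, b, n_max=2000):
    """Return (total probability, mean, variance) summed up to n_max."""
    probs = [gqed(n, nbar, b) for n in range(n_max + 1)]
    total = sum(probs)
    mean = sum(n * p for n, p in enumerate(probs))
    var = sum((n - mean) ** 2 * p for n, p in enumerate(probs))
    return total, mean, var

# Illustrative values only.
total, mean, var = moments(nbar=3.38, b=0.649)
print(total, mean, var, 3.38 / (1.0 - 0.649) ** 2)
```

For b = 0 the expression reduces to a Poisson distribution, which is a quick sanity check on any implementation.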

The implications of this physical description for the
shapes of clusters were studied in Yang & Saslaw (2012,
paper 1), which provided a method for determining the
probability that a cell with N galaxies has a particular
kinetic energy and correlation potential energy. This
method uses the statistical mechanics of the GQED
and is based on earlier work by Leong & Saslaw (2004).

The manner in which the internal structures and 
shapes of clusters of galaxies are connected to the large 
scale structure of the universe through the GQED there- 
fore provides an opportunity for an observational test 
of the theory in paper 1 and earlier work, as well as a
means to study the internal structure of galaxy clusters. 
To do so, we analyze the New York University Value- 
added Galaxy Catalog (NYU-VAGC) derived from the 
Sloan Digital Sky Survey (SDSS). 

While a simulation may seem easier to analyze
with few uncertainties, simulations are essentially ap- 
proximations of the universe that involve other assump- 
tions and uncertainties that simplify the problem. These 
include the choice of simulation volume, the mass reso- 
lution and the initial conditions. Their associated effects 
may introduce further complications in a poorly designed 
simulation. Therefore, this paper examines comparisons 
with observations which can determine an appropriate 
set of constraints and parameters for future simulations. 

This paper is structured as follows: In section 2 we
describe the theoretical background with reference to
earlier work by Yang & Saslaw (2012), which we refer to
as paper 1. In particular, we analyze the relation between
the kinetic energy, gravitational correlation potential
energy and total energy within the framework of the
GQED. In section 3 we describe the NYU-VAGC samples
from the SDSS and our selection cuts. In section 4 we
describe the procedures we develop to obtain the
counts-in-cells parameters N̄ and b, and the algorithm we use to
identify cells that contain clusters from the catalog. In
section 5 we compare our observations with the theory.
Finally we discuss our results in section 6. In this paper
we use Ω_m = 0.3, Ω_k = 0.0, Ω_Λ = 0.7 and H_0 = 100h
km s^-1 Mpc^-1 following Blanton et al. (2005).

2. THEORETICAL BACKGROUND 

The theoretical background of this paper is based on
earlier work by Yang & Saslaw (2012) which we refer
to as paper 1. To begin, we describe the kinetic and
correlation potential energies of a cell in terms of scaled
dimensionless variables and factor out the dimensional
quantities such as the cell radius and average mass of
a galaxy. Thus we write distances in units of cell radii,
time in units of dynamical times (c.f. equation 23 below)
and masses in terms of the average mass of a galaxy.

From this, we can write the observed scaled dimensionless
energies of a cell with N galaxies as (Equations (61)
and (69) of paper 1)

    W_* = -\frac{4(N-1)}{9N\zeta(\epsilon/R)} \left\langle \frac{R}{r} \kappa(r/\epsilon) \right\rangle    (3)

for the scaled correlation potential energy and T*, given
by Equation (69) of paper 1 (equation 4 here), for the
scaled kinetic energy. Here R is the radius of a
cell and v is the peculiar velocity of a galaxy in units
of cell radii per dynamical time. These scaled energies
are written in terms of the average potential energy of
a galaxy in a cell so that T* represents a ratio of the
kinetic energy of a cell to the average potential energy
of a galaxy and is related to the correlation virial ratio,
while W* is a measure of the compactness of the galaxy
distribution within a cell.

In equations (3) and (4), κ(r/ε) is a dimensionless
modification factor to a point mass potential that describes
its departure from a Newtonian point-source potential
through

    \phi(r) = -\frac{G m^2}{r} \kappa(r/\epsilon)    (5)

where ε is a parameter that describes the strength of the
modification. This modification may be caused by an
extended halo (Ahmad et al. 2002) or a merging pair (Yang
et al. 2011) among other possibilities. The ζ(ε/R) factor
is related to κ(r/ε) through equation (6) of paper 1
and is of order unity when ε is small compared to the
cell radius (c.f. Figure 1 of Ahmad et al. 2002). In such
cases, galaxies may be reasonably approximated as point
masses.

Because these scaled energies come from observations,
the observed T* and W* describe a snapshot of the cell which
fluctuates about the quasi-equilibrium values of T* and W*.
These quasi-equilibrium energies are essentially
time-averaged values of T* and W* taken over the fluctuation
timescale of a cell. This is based on the property of the
cosmological many-body problem that the macroscopic
evolution of a region in quasi-equilibrium is slow
(approximately equal to or greater than a Hubble time, e.g.
Saslaw 2000) compared to its crossing time so that
equilibrium prevails approximately.

Regions that are not in quasi-equilibrium generally
have a lower entropy than regions that are, and their
configurations will usually change toward the higher-entropy
quasi-equilibrium state (Saslaw 2000, Section 15.2). The
timescale for this relaxation is approximately the dynamical
crossing time of the configuration, and is shorter
than the quasi-equilibrium evolution timescale of the
entire system. This is because the quasi-equilibrium
evolution timescale of the entire system is at least as long
as a Hubble time, while clusters, being overdense regions,
have a shorter crossing time. This suggests that
quasi-equilibrium is a good approximation for statistically
homogeneous cosmological self-gravitating systems
of galaxies at any epoch (Saslaw 2000, Section 25.6).
The observed galaxy distribution strongly supports this
assumption (e.g. Sivakoff & Saslaw 2005; Rahmani et al.
2009; Yang & Saslaw 2011).

Therefore, we emphasize the distinction between the 
snapshot and quasi-equilibrium (time averaged) quanti- 
ties because clusters of galaxies are not in strict equilib- 
rium and their energies will fluctuate about their quasi- 
equilibrium value. This intrinsic fluctuation will mean 
that the observed snapshot energies of a specific clus- 
ter of galaxies may not agree with its quasi-equilibrium 
value, but the time-averaged energy of a cluster and the
energies of an ensemble of clusters will be distributed
about quasi-equilibrium.

To complete our description of the clustering of galaxies
in a cell, we introduce two related quantities: the
scaled correlation energy E* = W* + T* and the
correlation virial ratio ψ = −W*/2T*. The quasi-equilibrium
counterparts to these quantities are defined in the same
way from the quasi-equilibrium values of W* and T*. In
quasi-equilibrium, these quantities are further related to each
other through the relations given in Leong & Saslaw (2004)
and paper 1.

These energies are also related to the larger ensemble
through the virial ratio. In particular, the clustering
parameter can also be written as (e.g. Saslaw & Fang
1996)

    b = -\frac{W}{2K}    (10)



where W is the ensemble average correlation potential
energy and K is the ensemble average kinetic energy.
While equation (10) suggests that b and ψ are similar,
they describe very different systems. Here, ψ describes
the quasi-equilibrium correlation virial ratio for a single
cell, while b describes the ensemble average over all cells
in the ensemble. In particular, there may be cells that
are virialized, with a value of ψ close to 1, while other
cells may have a lower value of ψ. The average value
of ψ taken over all cells in the ensemble is b and the
distribution of ψ, which we discuss in the rest of this
section, is closely related to the distribution of cell energies
P(E, N).

2.1. Probabilities 

The probability that a cell with N galaxies in
quasi-equilibrium in a grand canonical ensemble has total
energy E is given by the usual result from statistical
mechanics (e.g. Leong & Saslaw 2004)

    P(E, N)\,dE = g(E) \frac{e^{-E/T_0} e^{N\mu/T_0}}{Z_G}\,dE    (11)

where g(E) is the density of states having energy E, and
T_0, μ and Z_G are the temperature, chemical potential
and partition function of the grand canonical ensemble.
Here we use units of temperature where the Boltzmann
constant is unity so temperature has energy units. This
relates the scaled energies to the GQED.

To write equation (11) in terms of the scaled energies,
we use results described in paper 1. The first is to project
the grand canonical ensemble into a canonical ensemble

    P_N(E)\,dE = f_V(N)\,P(E|N)\,dE    (12)

where P(E|N)dE is the conditional probability that a
cell has energy E given that it has N galaxies, and f_V(N)
is the counts-in-cells distribution of equation (1). Then,
we use the entropy of a canonical ensemble of galaxies
in quasi-equilibrium (Ahmad et al. 2002) to obtain the
density of states in terms of T*

    g(E_*[T_*]) = \frac{d\Omega}{dT_*} \frac{dT_*}{dE_*}    (13)

where Ω is the number of energy states for a small range
of energy ΔE so that Ω = g(E)ΔE.

Using equation (13), the fugacity e^{Nμ/T_0} and the
grand canonical partition function Z_G from Ahmad et al.
(2002) to write equation (11) explicitly, we get an explicit
expression for P(E*[T*]|N)dE*, equation (14) (c.f.
equation (33) of paper 1),



which describes the probability that a cell with N galaxies
has a quasi-equilibrium scaled correlation energy of
E*. This depends on the scaled quasi-equilibrium kinetic
energy T*, the mean number of galaxies in a cell N̄
and the clustering parameter b. This dependence on N̄
and b relates the internal structure of a cell to the large
scale structure of the universe.

2.2. Normalization 



To normalize the probability in equation (14), we consider
the range of T* that represents quasi-equilibrium.
In the limit of weak gravitational interactions, T* → ∞
and the system approximates an ideal gas, which is a
limit of the GQED. For the case of small T*, we use the
condition that in virial equilibrium, the crossing time of
a cell is approximately its dynamical time. This gives
us a minimum value of T*,min ≈ 0.1. We therefore use
T*,min = 0.1 following paper 1 and normalize the
probabilities for the range 0.1 < T* < ∞.

To numerically calculate the probability and normalization,
we rewrite equation (14) in terms of ψ using the
change of variables

    P(\psi|N)\,d\psi = P(E_*|N) \left| \frac{dE_*}{dT_*} \frac{dT_*}{d\psi} \right| d\psi    (15)

so that the normalization integral becomes

    P_{N,\mathrm{norm}} = \int_0^1 P(\psi|N)\,d\psi.    (16)

Equation (9) indicates that the quasi-equilibrium limits
of ψ are within the range 0 < ψ < 1. Thus, writing the
probability in terms of ψ transforms the normalization
integral into a definite integral and simplifies its numerical
evaluation. We illustrate the relationship between T*
and ψ in figure 1.

The probability in terms of ψ is given by

    P(\psi_1 < \psi < \psi_2) = \frac{1}{P_{N,\mathrm{norm}}} \int_{\psi_1}^{\psi_2} P(\psi|N)\,d\psi    (17)

from which the probability P(T_{*,1} < T_* < T_{*,2}) follows
as

    P(T_{*,1} < T_* < T_{*,2}) = P(\psi[T_{*,1}] < \psi < \psi[T_{*,2}]).    (18)

Here we use equation (9) to get ψ as a function of T*.

The probability that a cell has an energy of W* or
E* is complicated by the fact that W*[ψ] and E*[ψ] are
double-valued. This is because virialized systems have a
negative specific heat C_V (equation 19), so that E*[T*]
is double valued. The specific heat has a transition
point from positive to negative at T* = 0.54 (Leong
& Saslaw 2004) which corresponds to ψ = 0.86. The
double-valued nature of W*[ψ] follows from the definition
of W* = E* − T* and has a minimum at ψ = 2/3.
To illustrate this, we plot W*[ψ] in figure 1.
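
The quoted turning points can be checked numerically. The sketch below assumes the quasi-equilibrium relation ψ = 1/(1 + T*³); this specific form is an assumption inferred here from the numbers quoted in the text, not an expression taken from the source, and all function names are illustrative. Combined with the definitions ψ = −W*/2T* and E* = W* + T*, it reproduces a specific-heat transition near T* = 0.54 (ψ = 0.86) and a minimum of W*[ψ] at ψ = 2/3.

```python
def psi_of_T(T):
    """Assumed quasi-equilibrium relation psi = 1 / (1 + T*^3)."""
    return 1.0 / (1.0 + T ** 3)

def T_of_psi(psi):
    """Inverse of the assumed relation, valid for 0 < psi < 1."""
    return ((1.0 - psi) / psi) ** (1.0 / 3.0)

def W_of_psi(psi):
    """Scaled correlation potential energy from psi = -W*/(2 T*)."""
    return -2.0 * psi * T_of_psi(psi)

def E_of_T(T):
    """Scaled correlation energy E* = W* + T* = T* (1 - 2 psi)."""
    return T * (1.0 - 2.0 * psi_of_T(T))

# Locate the turning points on fine grids (step 1e-4).
T_grid = [0.1 + 1e-4 * i for i in range(19001)]   # 0.1 <= T* <= 2.0
psi_grid = [1e-4 * i for i in range(1, 10000)]    # 0 < psi < 1
T_turn = min(T_grid, key=E_of_T)        # where dE*/dT* changes sign
psi_min = min(psi_grid, key=W_of_psi)   # where W*[psi] is most negative

print(f"T* at specific-heat transition: {T_turn:.3f}")
print(f"psi at that point:              {psi_of_T(T_turn):.3f}")
print(f"psi at minimum of W*:           {psi_min:.3f}")
```

A grid search is used instead of a root finder only to keep the sketch dependency-free; the turning points can also be found analytically from the same assumed relation.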

To calculate the probability for W* or E*, we add the
probability for both solutions to get

    P(W_{*,1} < W_* < W_{*,2}) = P(\psi_-[W_{*,1}] < \psi < \psi_-[W_{*,2}])
                                + P(\psi_+[W_{*,1}] < \psi < \psi_+[W_{*,2}])    (20)

where ψ_−[W*] and ψ_+[W*] describe the conversion from
W* to ψ for the two solutions. The probability
for E* is similarly defined.
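
To make the two solutions concrete, the following sketch inverts W*[ψ] on either side of its minimum by bisection. It again assumes the relation ψ = 1/(1 + T*³), which gives W*[ψ] = −2 ψ^(2/3) (1−ψ)^(1/3); that relation and the helper names bisect and psi_branches are assumptions made for illustration, not the authors' implementation.

```python
def W_of_psi(psi):
    """W*[psi] under the assumed relation psi = 1 / (1 + T*^3)."""
    return -2.0 * psi ** (2.0 / 3.0) * (1.0 - psi) ** (1.0 / 3.0)

PSI_MIN = 2.0 / 3.0   # location of the minimum of W*[psi]

def bisect(f, lo, hi, tol=1e-12):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) differ in sign."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (f(mid) > 0) == (f_lo > 0):
            lo = mid      # root lies in the upper half
        else:
            hi = mid      # root lies in the lower half
    return 0.5 * (lo + hi)

def psi_branches(W):
    """Return (psi_minus, psi_plus) with W*[psi] = W on either side of 2/3."""
    if not W_of_psi(PSI_MIN) < W < 0.0:
        raise ValueError("W* must lie between its minimum and zero")
    g = lambda psi: W_of_psi(psi) - W
    psi_minus = bisect(g, 1e-12, PSI_MIN)
    psi_plus = bisect(g, PSI_MIN, 1.0 - 1e-12)
    return psi_minus, psi_plus

psi_minus, psi_plus = psi_branches(-0.8)
print(psi_minus, psi_plus)
```

Given the two roots for each end of a W* interval, the probability for that interval is the sum of the two ψ-interval probabilities, one per branch.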

These probabilities cover a wide range of conditions
from virialized clusters to unbound collections of galaxies,
and are described in detail in paper 1. A key prediction
of this theory is that most clusters are bound
and virialized with negative specific heat, and clusters
with more galaxies are very likely to have a negative
specific heat. We illustrate these probabilities in figure 2 for
N = 5 and N = 15 and show that the negative specific
heat branch dominates the probability, and the probability
that a cell with 15 galaxies has positive specific heat
is negligible. Therefore we focus on cells with fewer than
20 galaxies because these cells have more pronounced
features in the positive specific heat branch of the P(ψ)
histogram.



3. CATALOG DATA 

The New York University Value-Added Galaxy Catalog
(NYU-VAGC, Blanton et al. 2005) is a composite
catalog with the Sloan Digital Sky Survey (SDSS) data as
its primary component. It contains over 550,000 galaxies
with their redshifts and positions on the sky. The catalog
also contains extinction corrected and K-corrected
absolute magnitudes for 8 bands, of which the u, g, r, i
and z bands come from the SDSS and the J, H and K_s
bands come from the 2-Micron All-Sky Survey (2MASS),
although for this study we use only the data from the
SDSS. The galaxies in the catalog are also corrected for
fibre collisions using the "nearest" method described in
Blanton et al. (2005). Less than 10% of the galaxies are
affected by this correction, which allows for a more
complete sample in crowded regions.

In addition to the galaxy catalog, the NYU-VAGC also
contains a survey geometry catalog that describes the
survey footprint in terms of spherical polygons (described
in Blanton et al. 2005). Since the SDSS is not an all-sky
survey, the survey footprint determines the positions of
cells and allows us to lay down cells where there is valid
data.

For this work, we use the large scale structure samples
in the version of the catalog corresponding to the seventh
data release of the SDSS (Abazajian et al. 2009, DR7).
We use the subsample with a flux limit of r < 17.6 and
perform further selection cuts to obtain a volume and
flux limited sample within a given redshift range.

To obtain a sufficiently large sample of galaxies, we use
a low redshift sample in the range 0.01 < z < 0.12. This
redshift range contains a number of interesting structures
including the Coma and Leo clusters, as well as
the SDSS Great Wall. Using the criteria for completeness
from Yang & Saslaw (2011), the sample is complete
in this redshift range for a z-band absolute magnitude
M_z < −20.8. This limit is close to L* for a low redshift
sample (Blanton et al. 2003).

Instead of the r-band data used in Yang & Saslaw
(2011), here we use the z-band because these galaxies
are at a low redshift and the number counts in the z-band
complete sample are comparable to the r-band sample.
The other reason for using the z-band is that longer
wavelengths are more sensitive to older stellar populations
and are less prone to dust extinction. The z-band
luminosities therefore provide a more consistent means of
identifying galaxies of different morphologies and ages.

In addition to the complete sample, we use two other 
samples with a faint limit that is 1.2 and 1.8 magnitudes 
brighter. These brighter samples select for more massive 
galaxies that are more likely to dominate the potential 
of a cell and be the center of a large satellite system. If 
these satellite systems are virialized, they may be sub- 
clusters that form the building blocks of a rich cluster. 
This is particularly important because a comparison of
bound and virialized probabilities in paper 1 suggests
that clusters with more than about 10 members are very 
likely to be bound and virialized. These brighter galax- 
ies may trace the positions of subclusters and allow us 
to represent a rich cluster by a small number of subclus- 
ter centers and thus obtain information about the larger 
scale assembly of clusters. 

To increase the number of possible clusters for these
brighter samples, we also consider an extended redshift
range of 0.12 < z < 0.2. We
summarize these samples in table 1 and plot these limits
on the observed luminosity function in figure 3.

Figure 1. Left plot: T* as a function of ψ using equation (9). The vertical dashed line separates the two solutions of ψ[E*] which
correspond to unvirialized cells with positive specific heat for ψ < 0.86 and mostly virialized cells with negative specific heat for ψ > 0.86.
T*[ψ] goes to infinity as ψ goes to 0. Right plot: W* as a function of ψ using equations (8) and (9). The vertical dashed line at ψ = 2/3
separates the two solutions of ψ[W*].

4. DATA ANALYSIS AND PROCEDURE 

Having selected subsamples of the NYU-VAGC for 
analysis, we identify clusters in the catalog, and deter- 
mine the sample values of N and b. Here we define a 
cluster as a concentration of galaxies in a cell without 
regard to its virialization, and identify the cells that host 
these clusters. This definition of a cluster is particularly 
convenient because we can identify clusters solely on the 
basis of the positions of its members. 

For our analysis, we select cell sizes that are represen- 
tative of clusters in general. The cells are circular with a 
projected radius R on the sky of 2.0, 5.0, 10.0 and 20.0
h^-1 Mpc where h is related to the Hubble parameter
H_0 by h = H_0/(100 km s^-1 Mpc^-1). The smaller cell sizes describe the
scale of a typical cluster or group of galaxies. The larger 
cell sizes describe the clustering of clusters and groups. 
Here, we describe such large structures as superclusters 
because they are essentially clusters of galaxy clusters 
and groups. 

In the radial direction, the cells are defined by selecting
velocity half-widths Δ(cz) of 500, 1000 and 1500 km/s
such that galaxies with a redshift within Δ(cz) of the
cell's central redshift are considered members. This
approximately cylindrical cell geometry reduces the effect
of redshift space distortions by averaging over a range
of redshifts and allows us to select cluster members by
their peculiar velocities. Because of redshift space
distortions, the use of spherical cells to identify clusters is not
possible without a sufficiently precise secondary distance
measure to all the galaxies in the cell.

To calculate N̄ and b, we use the procedure in Yang &
Saslaw (2011) with our essentially cylindrical cell
geometry instead of spherical cells. Because we are interested
in the average values of N̄ and b, we do not calculate
detailed error estimates for their inferred values, but note
that because of cosmic variance, the estimated values of
N̄ and b generally vary by about 25% from quadrant
to quadrant (Yang & Saslaw 2011). We summarize our
results in table 2.

4.1. Cluster Identification 

To identify clusters and concentrations of galaxies, we
use a modified version of the procedure described in
section 2.2 of Wen et al. (2009) to take into account different
cell sizes. This procedure identifies the cells having the
most galaxies in the sample with the cells most likely to
host a cluster. The algorithm we use is as follows:

1. For each galaxy in the sample, we assume that it is
the central galaxy of a cluster and count the number
of galaxies within a projected distance of R and
a redshift range of Δ(cz). We make no assumption
as to whether these galaxies are cluster members or
background galaxies. However, with an appropriate
choice of Δ(cz), most of the identified galaxies
are cluster members. In addition, we require that
at least 95% of the projected cell area is within
the SDSS footprint. This will exclude cells close
to the edge of the SDSS footprint that may have
uncertain counts.

2. To avoid repeated identifications and overlapping
cells, we remove overlapping cells using the following
procedure:

(a) Sort the list of cells by number of galaxies in
descending order. If two cells have the same
number of galaxies, the cell with the brighter
galaxy is placed first.

(b) Select the first cell in the sorted list.

(c) Remove all other cells that overlap with the
selected cell.

(d) Select the next cell in the list and repeat steps
(c) and (d) until the end of the list is reached.

This procedure places emphasis on rich clusters by
preferring clusters that have more members. In
such clusters, the pairwise distance is smaller than
in sparser clusters, and therefore the members are
more likely to be bound to each other.

3. To get a list of cells that host a cluster in the
sample, we remove cells with N = 0 or N = 1.

Figure 2. P(ψ) for different values of N̄ and b plotted as a function of ψ. The left panel is for N = 5 and the right panel is for N = 15.
The positive specific heat branch covers ψ < 0.86 and the negative specific heat branch covers ψ > 0.86.

Figure 3. z-band observed luminosity function for the NYU-VAGC at 0.01 < z < 0.12 (left) and 0.12 < z < 0.20 (right). The vertical
lines indicate the absolute magnitude cuts we have adopted. Here h = H_0/100 km s^-1 Mpc^-1.
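
The three steps of the cluster identification algorithm can be sketched as follows. This is a minimal illustration under several assumptions that are not stated in the text: positions are flat-sky Cartesian coordinates in h^-1 Mpc, the count N includes the central galaxy, two cells overlap when their centers are closer than 2R in projection and 2Δ(cz) in redshift, and the survey footprint check is omitted. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from math import hypot

@dataclass(frozen=True)
class Galaxy:
    x: float    # projected position [h^-1 Mpc], flat-sky assumption
    y: float
    cz: float   # redshift velocity [km/s]
    mag: float  # absolute magnitude (more negative = brighter)

def count_members(center, galaxies, R, dcz):
    """Step 1: galaxies within projected radius R and |cz offset| <= dcz."""
    return sum(1 for g in galaxies
               if hypot(g.x - center.x, g.y - center.y) <= R
               and abs(g.cz - center.cz) <= dcz)

def cells_overlap(a, b, R, dcz):
    """Assumed overlap test for two cylindrical cells of equal size."""
    return hypot(a.x - b.x, a.y - b.y) < 2 * R and abs(a.cz - b.cz) < 2 * dcz

def find_clusters(galaxies, R, dcz):
    # Step 1: one candidate cell per galaxy.
    cells = [(count_members(g, galaxies, R, dcz), g) for g in galaxies]
    # Step 2(a): richest first; ties go to the brighter central galaxy.
    cells.sort(key=lambda c: (-c[0], c[1].mag))
    # Steps 2(b)-(d): keep a cell only if it overlaps no cell kept so far,
    # which is equivalent to removing everything that overlaps a kept cell.
    selected = []
    for n, g in cells:
        if all(not cells_overlap(g, s, R, dcz) for _, s in selected):
            selected.append((n, g))
    # Step 3: drop cells that contain only the central galaxy.
    return [(n, g) for n, g in selected if n >= 2]
```

The pair counting here is quadratic in the number of galaxies; a spatial index would be the natural replacement for samples of the size used in this paper.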

Using this procedure, we identify clusters in the catalog
using the given samples and cell sizes. In these samples,
we take special note of cells with N < 20 because
these cells are more likely to have a non-negligible chance
of being unvirialized. We summarize the results of the
cluster detection algorithm in table 3 with the number of



Table 1
Selected subsamples

Sample   Magnitude        Redshift           Density n̄        Galaxies
         M - 5 log(h)                        (h^3 Mpc^-3)

1a(z)    M_z < -20.8      0.01 < z < 0.12    4.56 x 10^-3     149418
1b(z)    M_z < -22.0      0.01 < z < 0.12    4.13 x 10^-4     13535
1c(z)    M_z < -22.6      0.01 < z < 0.12    5.34 x 10^-5     1749
2b(z)    M_z < -22.0      0.12 < z < 0.20    5.37 x 10^-4     59162
2c(z)    M_z < -22.6      0.12 < z < 0.20    9.42 x 10^-5     10388



Table 2
z-band counts-in-cells f_V(N)

Sample   Δ(cz)     Cells      N̄          b
         (km/s)

R = 2.0 h^-1 Mpc
1a(z)    500       1157095    0.542      0.470
1a(z)    1000      1107836    1.09       0.522
1a(z)    1500      1059759    1.62       0.542
1b(z)    500       1157095    0.0486     0.154
1b(z)    1000      1107836    0.0961     0.189
1b(z)    1500      1059759    0.143      0.203
1c(z)    500       1157095    0.00625    0.037
1c(z)    1000      1107836    0.0123     0.049
1c(z)    1500      1059759    0.0182     0.055

R = 5.0 h^-1 Mpc
1a(z)    500       134329     3.38       0.649
1a(z)    1000      128923     6.74       0.696
1a(z)    1500      123334     10.1       0.712
1b(z)    500       134329     0.302      0.306
1b(z)    1000      128923     0.597      0.360
1b(z)    1500      123334     0.884      0.381
1c(z)    500       134329     0.0386     0.094
1c(z)    1000      128923     0.0760     0.121
1c(z)    1500      123334     0.111      0.135

R = 10.0 h^-1 Mpc
1a(z)    500       132662     13.4       0.741
1a(z)    1000      132662     26.8       0.782
1a(z)    1500      127476     40.1       0.796
1b(z)    500       132662     1.18       0.421
1b(z)    1000      132662     2.36       0.491
1b(z)    1500      127476     3.50       0.517
1c(z)    500       132662     0.147      0.154
1c(z)    1000      132662     0.295      0.209
1c(z)    1500      127476     0.436      0.234
2b(z)    500       133055     1.57       0.460
2b(z)    1000      133055     3.15       0.526
2b(z)    1500      127917     4.76       0.551
2c(z)    500       133055     0.271      0.222
2c(z)    1000      133055     0.541      0.280
2c(z)    1500      127917     0.811      0.302

R = 20.0 h^-1 Mpc
1a(z)    500       131562     53.3       0.796
1a(z)    1000      131562     106        0.834
1a(z)    1500      131562     160        0.849
1b(z)    500       131562     4.61       0.507
1b(z)    1000      131562     9.24       0.584
1b(z)    1500      131562     13.9       0.619
1c(z)    500       131562     0.566      0.221
1c(z)    1000      131562     1.14       0.303
1c(z)    1500      131562     1.72       0.343
2b(z)    500       132665     6.47       0.563
2b(z)    1000      132665     12.9       0.634
2b(z)    1500      132665     19.4       0.660
2c(z)    500       132665     1.09       0.312
2c(z)    1000      132665     2.17       0.385
2c(z)    1500      132665     3.25       0.415



clusters and the number of galaxies in the densest cell. 
The results show that the 1c(z) and 2c(z) samples indeed
trace only the brightest galaxies, with most 1c(z)
and 2c(z) cells having fewer than 20 galaxies.

4.2. Cluster Energies 

To estimate the instantaneous scaled energies T* and
W*, we need further assumptions. The first is that the
masses of all the galaxies in the cell are approximately
equal to their average mass. This is generally a reasonable
assumption for our purpose because the estimated
values of T* and W* are not very sensitive to uncertainties
in the galaxy masses. For example, figure 6 of paper 1
suggests that a difference in the mass of a factor of a few 
will lead to a worst case error in W* of about 10%. This 
is mainly because uncertainties from the transverse ve- 
locities and the anisotropy of a cluster's shape are likely 
to contribute more importantly to uncertainties in the 
energy estimates. 

Using the equal mass assumption, we can calculate an
approximate center-of-mass for the cluster both in
projection and in redshift space. Using this center-of-mass,
we determine a galaxy's position on the sky r_⊥ and radial
peculiar velocity v_∥ with respect to the cluster's
center-of-mass. This gives us structure information using the
three observable quantities in phase space. The other
three quantities, namely the radial position r_∥ and the
transverse velocities giving v_⊥, will have to be estimated
from further assumptions.

The next assumption is that the point-mass approxi- 
mation is a good approximation for the potential of an 
individual galaxy. This is generally true for the cells we 
use because the half-mass radius of a galaxy is small
compared to the cell radius. Under this approximation, ε/R
is small and the ζ(ε/R) and κ(r/ε) factors are essentially
unity.

4.2.1. Kinetic Energy 

To estimate the instantaneous scaled kinetic energy T*
for a cell in the sample, we suppose that the velocity
distribution and spatial distribution averaged over the
entire cell are isotropic. This assumption gives

    \langle v^2 \rangle = \langle v_\parallel^2 \rangle + \langle v_\perp^2 \rangle    (21)

for total peculiar velocity v, radial velocity v_∥ and
transverse velocity v_⊥. Here the angle brackets denote the
average over the cell. However, because the transverse
velocity cannot be observed, we must use a free parameter,
ν, to describe the transverse velocity so that



(v 2 ) = (vf) + K) 




tfy. (22) 



Table 3

z-band Summary of Clusters

Sample   $\Delta(cz)$ (km/s)   Clusters (N < 20)   Clusters (All)   Maximum N

$R = 2.0\,h^{-1}$ Mpc
1a(z)     500                  16624               16760            47
1a(z)    1000                  13654               13963            75
1a(z)    1500                  11515               11901            92
1b(z)     500                   2051                2051             9
1b(z)    1000                   2132                2132            11
1b(z)    1500                   2094                2094            13
1c(z)     500                    125                 125             4
1c(z)    1000                    147                 147             1
1c(z)    1500                    161                 161             5

$R = 5.0\,h^{-1}$ Mpc
1a(z)     500                   6122                7104           105
1a(z)    1000                   3265                4586           170
1a(z)    1500                   2046                3466           208
1b(z)     500                   2260                2260            16
1b(z)    1000                   2027                2029            21
1b(z)    1500                   1802                1806            21
1c(z)     500                    242                 242             6
1c(z)    1000                    288                 288             7
1c(z)    1500                    301                 301             7

$R = 10.0\,h^{-1}$ Mpc
1a(z)     500                   1165                2471           183
1a(z)    1000                    271                1362           304
1a(z)    1500                    103                 963           367
1b(z)     500                   1477                1489            29
1b(z)    1000                   1020                1059            39
1b(z)    1500                    780                 835            15
1c(z)     500                    295                 295             9
1c(z)    1000                    318                 318            13
1c(z)    1500                    307                 307            13
2b(z)     500                   4893                4941            37
2b(z)    1000                   3209                3398            51
2b(z)    1500                   2316                2608            61
2c(z)     500                   1654                1654            11
2c(z)    1000                   1593                1593            15
2c(z)    1500                   1455                1455            16

$R = 20.0\,h^{-1}$ Mpc
1a(z)     500                     31                 733           413
1a(z)    1000                      2                 385           606
1a(z)    1500                                        260           758
1b(z)     500                    593                 656            58
1b(z)    1000                    215                 361            85
1b(z)    1500                    129                 262            95
1c(z)     500                    303                 303            11
1c(z)    1000                    230                 231            21
1c(z)    1500                    191                 192            23
2b(z)     500                   1686                2008            68
2b(z)    1000                    577                1122           103
2b(z)    1500                    233                 782           122
2c(z)     500                   1257                1259            22
2c(z)    1000                    887                 892            25
2c(z)    1500                    674                 690            33

In the case of a cell with isotropic velocities, $\nu^2 = 3$.

To rescale the units of velocity to dimensionless units of cell radii
per dynamical time, we estimate the dynamical time of the cell from its
crossing time. A simple estimate of the crossing time is the time a
test particle moving at a velocity $\sqrt{\langle v^2 \rangle}$ needs
to traverse the radius of a cluster from its center of mass,
$\langle r \rangle$. Thus we have

$$ \tau_{\rm dyn} \propto \tau_{\rm cross} = \frac{\langle r \rangle}{\sqrt{\langle v^2 \rangle}} \tag{23} $$

where we define the dynamical time $\tau_{\rm dyn}$ as proportional to
the estimated crossing time $\tau_{\rm cross}$. The isotropic
assumption then gives

$$ \tau_{\rm dyn} = k_\tau \tau_{\rm cross} \tag{24} $$
where $k_\tau$ is a free parameter that describes the relationship
between the dynamical time and the estimated crossing time, and the
uncertainties in the estimates of the cell crossing time. Using
equation (71) of paper 1 for the scaled kinetic energy

$$ T_* = \frac{4}{9R^2} \frac{1}{\zeta(\epsilon/R)} \langle (v \tau_{\rm dyn})^2 \rangle \tag{25} $$

and equation (24), $T_*$ is

$$ T_* = \frac{4}{9} \nu^2 k_\tau^2 \frac{\tau_{\rm cross}^2}{R^2} \langle v_\parallel^2 \rangle \tag{26} $$

where $k_\tau$ and $\nu$ combine into a single dynamical free
parameter.
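
As a hedged illustration of equation (26), the sketch below evaluates $T_*$ from a list of radial velocities. The function name and argument names are hypothetical, the crossing time is taken as a user-supplied input, and all quantities are assumed to be in mutually consistent units so that $T_*$ comes out dimensionless. The default `nu_k_tau = sqrt(3)` corresponds to isotropic velocities with $k_\tau = 1$, which is an assumption for the example and not a fitted value.

```python
def scaled_kinetic_energy(v_par, tau_cross, R, nu_k_tau=3 ** 0.5):
    """Scaled kinetic energy T_* following equation (26).

    v_par     : radial velocities relative to the cluster center
    tau_cross : estimated crossing time of the cell
    R         : cell radius
    nu_k_tau  : combined dynamical free parameter (nu * k_tau)
    """
    # Mean square radial velocity <v_par^2> over the cell.
    mean_v_par_sq = sum(v * v for v in v_par) / len(v_par)
    return (4.0 / 9.0) * nu_k_tau ** 2 * tau_cross ** 2 * mean_v_par_sq / R ** 2
```

Because $\nu$ and $k_\tau$ enter only through their product, doubling `nu_k_tau` multiplies the returned value by four.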

4.2.2. Correlation Potential Energy 

To estimate the correlation potential energy, we assume that it is
essentially an extensive quantity. This is a reasonable approximation
because at scales where the two-point correlation function is
negligible, the expansion of the universe exactly cancels the smoothed
background potential (Saslaw & Fang 1996), so we can ignore the
background contribution for cells that are larger than the scale at
which the two-point correlation function $\xi_2$ is negligible. At
smaller scales, Saslaw & Fang (1996) show that extensivity is also a
good approximation because the correlation energy within a cell is much
greater than the correlation energy between cells. This means we can
calculate the correlation potential energy by considering only
gravitational interactions within the cell.

From paper 1, the scaled correlation potential energy is (cf.
equations (64), (65) of paper 1)



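$$ W_* = -\frac{8}{9N^2} \sum_{1 \le i < j \le N} \frac{R}{r_{ij}} = -\frac{8}{9N^2} \sum_{1 \le i < j \le N} \frac{R}{\left( r_{ij,\perp}^2 + r_{ij,\parallel}^2 \right)^{1/2}} \tag{27} $$

where we separate the 3D pairwise separation
$r_{ij} = (r_{ij,\perp}^2 + r_{ij,\parallel}^2)^{1/2}$ into its
transverse component $r_{ij,\perp}$ and radial component
$r_{ij,\parallel}$. Because we do not have precise distances to
galaxies within a cell, we separate out the radial component with a
parameter $\eta_{ij}$ such that

$$ \frac{1}{\left( r_{ij,\perp}^2 + r_{ij,\parallel}^2 \right)^{1/2}} = \frac{1}{\eta_{ij} r_{ij,\perp}}. \tag{28} $$

Substituting this into equation (27) gives

$$ W_* = -\frac{4(N-1)}{9N} \frac{1}{\eta} \left\langle \frac{R}{r_{ij,\perp}} \right\rangle \tag{29} $$

where $\eta$ is the averaged value of the individual $\eta_{ij}$ such
that

$$ \frac{1}{\eta} \left\langle \frac{R}{r_{ij,\perp}} \right\rangle = \left\langle \frac{R}{\eta_{ij} r_{ij,\perp}} \right\rangle. \tag{30} $$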

For a uniform spherical cell, the average projected pair- 
wise separation is approximately 



R 



1 



r ij,± 



R 



NirR 2 

i nr 

2 V TV 



27rr 



TV 
^R2 



(i? 2 - r 2 )dr 



(31) 



where $\langle 1/r_{ij,\perp} \rangle$ scales as $1/\sqrt{N}$ because
denser and more clustered cells are more likely to have close pairs
than less dense cells. Comparing this to the average inverse pairwise
separation $\langle 1/r_{ij} \rangle = 3/2$ for a spherical cell
(Yang & Saslaw 2012), we get $\eta \approx 1.69\sqrt{N}$ for a uniform
cell.

Although the cells we use are cylindrical, the clusters that the cells
contain are likely to be spherical or elliptical. This means that these
estimates are reasonable. Furthermore, actual clusters are also likely
to have internal structures that will cause a departure from the
uniform case. Therefore, the uniform and isotropic values of
$\nu k_\tau$ and $\eta$ are a baseline for comparison, and may
not necessarily represent a particular system.
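
The projected form of the correlation energy, $W_* = -\frac{4(N-1)}{9N} \frac{1}{\eta} \langle R/r_{ij,\perp} \rangle$, can be sketched as below. The function name is hypothetical, the positions are assumed to be flat-sky projected coordinates in the same units as $R$, and `eta` is left as an input because its value is a fitted parameter. Two galaxies at exactly the same projected position would cause a division by zero, which this sketch does not guard against.

```python
from itertools import combinations
from math import hypot


def scaled_correlation_energy(x, y, R, eta=1.0):
    """Scaled correlation potential energy W_* from projected separations.

    x, y : projected galaxy positions in the cell
    R    : cell radius (same units as x and y)
    eta  : averaged spatial anisotropy parameter
    """
    n = len(x)
    pairs = list(combinations(range(n), 2))
    # Average of R / r_perp over all N(N-1)/2 distinct pairs.
    mean_R_over_r = sum(
        R / hypot(x[i] - x[j], y[i] - y[j]) for i, j in pairs
    ) / len(pairs)
    return -4.0 * (n - 1) / (9.0 * n) * mean_R_over_r / eta
```

A single very close projected pair makes `R / r_perp` large for that pair and can dominate the average when the cell holds only a few galaxies.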

4.2.3. Correlation Virial Ratio 

From the kinetic energy and the correlation potential energy, the
correlation virial ratio is

$$ \psi = -\frac{W_*}{2T_*} = \frac{1}{\eta \nu^2 k_\tau^2} \frac{(N-1) \langle R/r_{ij,\perp} \rangle}{2N (\tau_{\rm cross}^2/R^2) \langle v_\parallel^2 \rangle} \tag{32} $$

which we write in terms of the observables $r_{ij,\perp}$ and
$v_\parallel$ and the unobserved anisotropy and dynamical parameters
$\eta$ and $\nu k_\tau$. Here again, we can group the anisotropy and
dynamical parameters into a single modification factor
$\eta \nu^2 k_\tau^2$ and relate $\psi$ to the observables $N$, $R$,
$r_\perp$ and $v_\parallel$. With independently determined values of
$\eta$, $\nu$ and $k_\tau$, this relation provides a prediction for
the observed virial ratio histogram.
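
A short sketch of the virial ratio in terms of the observables may help to show how the free parameters enter only through the single factor $\eta \nu^2 k_\tau^2$. The function and argument names are hypothetical, and the averages $\langle R/r_{ij,\perp} \rangle$ and $\langle v_\parallel^2 \rangle$ are assumed to have been computed beforehand in consistent units.

```python
def correlation_virial_ratio(n, mean_R_over_r_perp, mean_v_par_sq,
                             tau_cross, R, eta=1.0, nu_k_tau=1.0):
    """Correlation virial ratio psi = -W_* / (2 T_*) from observables.

    n                  : number of galaxies in the cell
    mean_R_over_r_perp : pair average of R / r_perp
    mean_v_par_sq      : cell average of the squared radial velocity
    tau_cross, R       : estimated crossing time and cell radius
    eta, nu_k_tau      : anisotropy and dynamical free parameters
    """
    numerator = (n - 1) * mean_R_over_r_perp
    denominator = 2.0 * n * (tau_cross ** 2 / R ** 2) * mean_v_par_sq
    # The free parameters appear only as the combined modification factor.
    return numerator / (eta * nu_k_tau ** 2 * denominator)
```

With `eta = nu_k_tau = 1` the function returns the uncorrected estimate; other parameter values simply rescale it by $1/(\eta \nu^2 k_\tau^2)$.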

5. COMPARISONS WITH THEORY 

To compare theory with observations, we calculate $T_*$, $W_*$ and
$\psi$ for each cell containing an identified cluster using equations
(26), (29) and (32). From these results, we can construct a histogram
of scaled energies for cells with given $N$. We then compare the
observed histograms to the quasi-equilibrium probabilities
$P(\psi_1 < \psi < \psi_2)$, $P(T_{*,1} < T_* < T_{*,2})$ and
$P(W_{*,1} < W_* < W_{*,2})$.

However, the observations also contain significant uncertainties that
will cause the estimated values of $T_*$ and $W_*$ to deviate from
their true values. Some of these uncertainties, such as the anisotropy,
come from inherent limitations in the observations. Others, such as
departures of instantaneous values of $T_*$ and $W_*$ from their
quasi-equilibrium values, are a result of dynamical fluctuations. Such
fluctuations are an intrinsic property of a self-gravitating system,
and early N-body simulations have shown that they may be as large as
20% (Aarseth & Saslaw 1972). Because of these fluctuations, it is
reasonable to expect that the instantaneous properties of physical
systems will only agree with average values on a statistical basis.

We can minimize the effect of those uncertainties that result from the
free parameters $\nu k_\tau$ and $\eta$, which we expect to be of order
unity. To do this, we determine the values of $\nu k_\tau$ and $\eta$
that minimize the total least squares distance between the expected and
observed histograms. This provides an overall statistical correction to
the anisotropy, but does not remove the intrinsic scatter that results
from differences in the detailed structure of a cell and chance
fluctuations in the phase space configurations of cells.

To incorporate these uncertainties using a simple model, we convolve
the expected probabilities of $T_*$, $W_*$ and $\psi$ with normal
distributions having zero mean, and variances $\sigma_{T_*}^2$,
$\sigma_{W_*}^2$ and $\sigma_\psi^2$. This is generally reasonable
since these uncertainties are likely to be combinations of a variety of
different independent factors that smooth out the theoretical
distributions and lower their peaks. We therefore obtain the expected
observed histograms by convolving the theoretical ones with normal
distributions of the observed uncertainties:

$$ P(T_{*,{\rm obs}}) \sim P(T_* (\nu k_\tau)^2) * {\rm Normal}(0, \sigma_{T_*}^2), \tag{33} $$

$$ P(W_{*,{\rm obs}}) \sim P(W_*/\eta) * {\rm Normal}(0, \sigma_{W_*}^2) \tag{34} $$

and

$$ P(\psi_{\rm obs}) \sim P(\psi/(\eta \nu^2 k_\tau^2)) * {\rm Normal}(0, \sigma_\psi^2). \tag{35} $$

Because these normal distributions represent combinations of various
observational uncertainties and dynamical fluctuations, the variances
$\sigma_{T_*}^2$, $\sigma_{W_*}^2$ and $\sigma_\psi^2$ do not have
simple interpretations. Therefore we focus our analysis on the
anisotropy and dynamical parameters $\eta$ and $\nu k_\tau$.
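
The smoothing step in equations (33) to (35) can be illustrated with a discrete convolution of binned probabilities. This is a sketch under stated assumptions and not the authors' implementation: the function name is hypothetical, the bins are assumed to be equally spaced, and each Gaussian kernel is renormalized on the finite grid so that total probability is conserved at the edges, which is one possible edge treatment among several.

```python
from math import exp


def convolve_with_normal(probs, bin_width, sigma):
    """Convolve binned probabilities with a zero-mean normal kernel.

    probs     : theoretical probability in each equally spaced bin
    bin_width : width of one bin
    sigma     : standard deviation of the normal distribution (> 0)
    """
    n = len(probs)
    smoothed = [0.0] * n
    for i in range(n):
        # Gaussian weights for spreading the content of bin i over all bins.
        weights = [exp(-0.5 * ((j - i) * bin_width / sigma) ** 2) for j in range(n)]
        total = sum(weights)
        for j in range(n):
            smoothed[j] += probs[i] * weights[j] / total
    return smoothed
```

Applied to a sharply peaked histogram, the result has a lower peak and broader wings, which is the qualitative effect described in the text.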

We can estimate these observational variances by searching for their
values which minimize the least-squares distance between the observed
and theoretical histograms for $T_*$ and $W_*$. To ensure that there
are enough clusters for reasonable statistics, we consider samples with
at least 50 clusters and use histograms with 32 bins to smooth out
fluctuations in the observations. Then, we use the inferred values of
$\eta$ and $\nu k_\tau$ in equation (35) to determine the best-fit
value of $\sigma_\psi$, and thus the expected histogram for $\psi$.
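
A least-squares histogram fit of this kind can be sketched as a simple search over candidate parameter values. The theoretical distribution from paper 1 is not reproduced here, so `model` is a hypothetical stand-in: any callable that maps a trial parameter to an expected histogram with the same binning as the observed one. A grid search is used only for clarity; the text does not specify the minimization method.

```python
def least_squares_distance(observed, expected):
    """Sum of squared differences between two histograms with equal binning."""
    return sum((o - e) ** 2 for o, e in zip(observed, expected))


def best_fit_parameter(observed, model, candidates):
    """Return the candidate whose model histogram is closest to the observed one.

    observed   : observed histogram (list of bin probabilities)
    model      : hypothetical callable, parameter -> expected histogram
    candidates : iterable of trial parameter values
    """
    return min(candidates, key=lambda p: least_squares_distance(observed, model(p)))
```

For a fit with two parameters, such as a scale parameter and a variance, the same function can be used with `candidates` holding tuples of trial values.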

To estimate uncertainties in the values of $\eta$, $\nu k_\tau$,
$\sigma_{T_*}$, $\sigma_{W_*}$ and $\sigma_\psi$, we use a jackknife
procedure that leaves out 10% of the sample in each instance. For this
analysis, we sort the clusters by galactic longitude and select a
contiguous subsample that contains 90% of the clusters. This resampling
method incorporates spatial information and is a simple method to
incorporate cosmic variance. Since there are fewer than 2000 clusters
for each sample with given $N$, we calculate the resulting values for
every possible subsample. This gives us an ensemble of values from
which we can determine the 1-$\sigma$ range of each uncertain value. We
summarize these results in tables 4, 5 and 6 and plot some of these
fits in figures 4, 5 and 6.
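
The contiguous leave-out-10% resampling can be sketched as follows. The function names are hypothetical, and two details are assumptions for the example: the subsamples do not wrap around the end of the longitude range, and the 1-$\sigma$ range is read off as the 16th to 84th percentile interval of the ensemble.

```python
from statistics import quantiles


def contiguous_subsamples(longitudes, keep_fraction=0.9):
    """Index lists of every contiguous subsample after sorting by longitude.

    Each subsample keeps `keep_fraction` of the clusters; wrap-around at the
    end of the longitude range is not considered (assumption).
    """
    order = sorted(range(len(longitudes)), key=lambda i: longitudes[i])
    size = max(1, round(keep_fraction * len(order)))
    return [order[start:start + size] for start in range(len(order) - size + 1)]


def one_sigma_range(values):
    """16th and 84th percentiles of an ensemble of fitted values."""
    cuts = quantiles(values, n=100)  # 99 cut points: cuts[k] is percentile k + 1
    return cuts[15], cuts[83]
```

In use, a fit would be repeated on the clusters selected by each index list, and `one_sigma_range` applied to the resulting ensemble of best-fit values.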

5.1. Discussion and Analysis

Table 4

Best-fit $\eta$, $\nu k_\tau$, $\sigma_{W_*}$, $\sigma_{T_*}$ and $\sigma_\psi$, and uncertainties for $N = 5$

Columns: Sample; $\Delta(cz)$ (km/s); Clusters; $W_*$ Histogram ($\eta$, $\sigma_{W_*}$); $T_*$ Histogram ($\nu k_\tau$, $\sigma_{T_*}$); $\psi$ Histogram ($(\nu k_\tau)^2 \eta$, $\sigma_\psi$)












i? = 2.0/i~ 


1 Mpc 








la(z 


500 


1614 


q 7<i+0.3o 

°-' -0.33 


9 r9 + l.Ub 

z " JZ -0.64 


1 47+o.ovi 

1 -^' -0.066 


a 1c + 1.02 
6 ' 16 -1.00 


8.08+i;- 


, t-9+0.22 
1 - oz -0.20 


la(z 


1000 


1317 


46 +0.33 
°-*°-0.42 


2 ,6 +1 1S 
Z ' OD -0.69 


1 41+O.O62 
1 '^ 1 -0.060 


6 31+0- 92 
°"- ) -0.93 


6.90li;| 3 


9 10+O.66 


la(z 


1500 


1204 


q 15+0.32 


2 39+ 124 
^' Oy -0.69 


1 oQ+0.056 

^•^-o.oss 


t- O4+0.83 

5 - S4 -0.85 


6 04+ 1 ' 17 

D - U4 -1.07 


2 41+0- 70 
z -^ 1 -0.78 


lb(z 


) 500 


53 


2 95+ - 40 
z - yo -0.54 


4 29+ 2 ' 29 
^■ zy -2.53 


1 40 +o.080 

1 '^ u -0.063 


q KK+2.84 


5-8o±i:tl 


9 trq + 1.65 
z - do -0.36 


lb(z 


) 1000 


78 


19+O.2I 
°- 1 -0.14 


1 7 o+0.6i 


1 qo+0. 061 
1 -°°-0.060 


6 5"S+ 101 
D ' oo -1.06 


5 90+0.97 
o.»u_ 75 


1 45+0.49 
i -*°-0.22 


lb(z 


) 1500 


72 


9 ofi+0.24 
z - Sb -0.20 


1 11+0.64 
l- 1 -0.20 


1 o A +0.060 
1 -°^-0.056 


e-sstl; 1 !? 


5.iot :? 6 


2-io + ;f 5 










R = 5.0/i" 


1 Mpc 








la(z 


500 


542 


cq+U.34 
o.o»_ 40 


1 oq+0.62 
l.OO_ 21 


1 r: fi +0.076 
i - OD -0.079 


fi 1S+ 1U2 
D ' 1,:5 -0.94 


8.75+i™ 


i.esiHi 


la(z 


1000 


243 


14 +0.21 


1 11+0.33 

i-H-o^s 


1 47 +0.069 
l- 4 <_0 .065 


6.73+;;° 8 5 


5 75+1.14 
D -'°-0.89 


1 fi7 +0.53 
!-D'_0.42 


la(z 


1500 


135 


9 vn+0-28 
z -' u -0.16 


1 01+0-57 

i- yi -o.57 


1 q R +0.052 
i - OD -0.056 


6.36t°;^ 


4 9R+0-95 
4 - yD -0.67 


1 59+0-69 


lb(z 


) 500 


186 


3-30±g;lS 


9 OO+0.87 


1 qn+0-049 
i - w -0.045 


591+0-93 

°' yl -0.91 


5 6n +°' 93 
°- DU -0.99 


1 55+0.51 


lb(z 


) 1000 


209 


2.96+°; 37 


1 Q4+0.50 
!- y4 -0.60 


1 9C -+0.041 
1 ' ZO -0.039 


5 04+O.68 


4.58i°;! 5 


1 69+ ' 65 
i.oy_ 29 


lb(z 


) 1500 


202 


9 01+0. 27 
z -° 1 -0.15 


, t-9+0.59 
1- OZ -0.46 


1 9fi +0.039 
i - ZD -0.040 


t- t-7+0.85 
°-°'-0.89 


4-441S-.S 


1 Q0+ ' 79 
4 - yu -0.39 










R = 10.0/T 


_i Mpc 








la(z 


500 


76 


3 Q2+ u - 4i 
°- az -0.18 


9 91+0.62 
z ' z -0.42 


1 09+O.I35 
1 ' oz -0.119 


6.28+i;- 


l3.0+ 3 ;5 7 


2 30 +U ' 49 
Z - OU -0.48 


lb(z 


) 500 


171 


q 1 5+0.70 
ci - lo -0.13 


1 91+O.5O 
1,z -0.18 


1 4Q+0.064 
1 ' w -0.062 


6.92tiii 


6 12+ 2 ' 06 
D - lz -0.76 


1 49+0-44 
1 -^ y -0.28 


lb(z 


) 1000 


120 


9 7 .1+0.35 
z -'^-0.21 


1 96+°' 92 
l-»°_0 .91 


1 99+O.O49 
1 ' zz -0.047 


c cc+0.84 
6 - 66 -0.89 


4 06+ ' 89 
*- UD -0.60 


1 34+1-09 
1 '°*-0.19 


lb(z 


) 1500 


58 


9 50 + O.I6 
z - oo -0.10 


9 9Q + 1.15 

^'^ a -0.94 


1 9 7 +0.048 
1 - z ' -0.050 


11+1.73 

8 - n -i.65 


4 15+O.6O 
^• lo -0.48 


9 15+123 
z - lo -0.37 


2b(z 


) 500 


516 


q R7+0.22 
°- D '-0.19 


1 47+0.40 

1 -^'-0.36 


1 cro+0.075 
1 -°°-0.066 


5 55+0.95 
°' oo -0.91 


9 14+1-50 
M - ±4 -1.18 


1 57+0.14 
I -°'-0.16 


2b(z 


) 1000 


309 


, 04+°- 35 

°- u -0.23 


1 yo + 1.06 

J-'»_0.47 


1 iq+0.058 
1 '^°-0.052 


5 04+0-70 
°- u *-0.71 


6 24+ 1 ' 29 
u - z *-0.89 


9 riQ+0.62 
Z - UO -0.68 


2b(z 


) 1500 


177 


9 O1+0.19 
Z - Si -0.12 


1 .n+°' 52 

l.dU_ 25 


I 40+O.O6I 
i - 4u -0.058 


7 06+ ' 98 
'■ UD -0.96 


5 50+O.88 
°- ou -0.66 


1 61 +(K(i2 


2c( z ; 


500 


98 


3 06+ ' 46 
°- UD -0.14 


1 63+ ' 89 


1 4fi +0.090 
!'™-0.086 


8-i2il:S 


6 5.+ 1 -93 


9 05+O.77 
z -°°-0.73 


2c( z ; 


1000 


118 


2 84+ ' 32 
z -°^-0.15 


1 90+ ' 41 
1 ' au -0.45 


1 oc+0.063 

■"■• Od -o.o57 


5 47+0.98 
°'^'-0.96 


5 29+ 1 - 14 
°- zu -0.67 


91+0' 14 
u - al -0.24 


2c( z ; 


1500 


142 


C\ yq-(-0.33 
z - ,,;> -0.20 


1 qo+0.59 
i.ao_ 72 


1 9R +0.044 
J - ZO -0.049

Table 5

Best-fit $\eta$, $\nu k_\tau$, $\sigma_{W_*}$, $\sigma_{T_*}$ and $\sigma_\psi$, and uncertainties for $N = 10$

Table 6

Best-fit $\eta$, $\nu k_\tau$, $\sigma_{W_*}$, $\sigma_{T_*}$ and $\sigma_\psi$, and uncertainties for $N = 15$

Sample   $\Delta(cz)$   Clusters   $W_*$ Histogram   $T_*$ Histogram   $\psi$ Histogram

Figure 4. Observed and predicted histograms for $T_*$ for cells with 1000 km/s velocity dispersion. The solid line is the observed histogram
and the dashed line is the expected histogram. Top row: 1a(z) sample, $R = 2.0\,h^{-1}$ Mpc; Middle row: 1a(z) sample, $R = 5.0\,h^{-1}$ Mpc;
Bottom row: 2b(z) sample, $R = 10.0\,h^{-1}$ Mpc. Left column: $N = 5$; Right column: $N = 10$.

Figure 5. Observed and predicted histograms for $W_*$ for cells with 1000 km/s velocity dispersion. The solid line is the observed histogram
and the dashed line is the expected histogram. Top row: 1a(z) sample, $R = 2.0\,h^{-1}$ Mpc; Middle row: 1a(z) sample, $R = 5.0\,h^{-1}$ Mpc;
Bottom row: 2b(z) sample, $R = 10.0\,h^{-1}$ Mpc. Left column: $N = 5$; Right column: $N = 10$.

Figure 6. Observed and predicted histograms for $\psi$ for cells with 1000 km/s velocity dispersion. The solid line is the observed histogram
and the dashed line is the expected histogram. Top row: 1a(z) sample, $R = 2.0\,h^{-1}$ Mpc; Middle row: 1a(z) sample, $R = 5.0\,h^{-1}$ Mpc;
Bottom row: 2b(z) sample, $R = 10.0\,h^{-1}$ Mpc. Left column: $N = 5$; Right column: $N = 10$.






The observed and expected histograms agree with each other in the case
of $T_*$, but the observed $W_*$ and $\psi$ histograms have a long tail
that is not accounted for by the theory. This long tail is a result of
close galaxy pairs in $r_\perp$ that make a large contribution to the
estimated $W_*$. These close pairs may be visual pairs that are
separated by a large distance along the line-of-sight. Such pairs will
have a large value of $\eta_{ij}$ that will increase the averaged value
of $\eta$.

Other close pairs may be merging pairs that are better treated as a
single extended galaxy since they are tightly bound and will eventually
merge into a single galaxy (Yang et al. 2011). For this reason, we can
expect a population of outliers that have a very negative value of
$W_*$. While it may be possible to identify merging pairs by their
dynamics or morphologies, such an analysis is beyond the scope of this
paper.

To estimate the goodness of fit quantitatively, we measure the
agreement between theory and observations using a one-sample
Kolmogorov-Smirnov test (e.g. Siegel 1956, chapter 4). We calculate
the statistic $Z$ for a sample of $n$ clusters using

$$ Z = \sqrt{n} \sup_i |F_{{\rm obs},i} - F_{{\rm exp},i}| \tag{36} $$

where $\sup_i$ denotes the largest absolute value of the difference
between the cumulative observed and expected probabilities
$F_{{\rm obs},i}$ and $F_{{\rm exp},i}$. These cumulative probabilities
are defined as

$$ F_{{\rm obs},i} = \sum_{j \le i} P_{{\rm obs},j} \tag{37} $$

for observed probability $P_{{\rm obs},i}$ in bin $i$, and the expected
cumulative probability $F_{{\rm exp},i}$ is similarly defined. To get
the probability of obtaining a histogram that is at least as extreme as
the observed histogram, or p-value, we compare the $Z$ statistic with
the Kolmogorov-Smirnov distribution. We summarize the results of the
Kolmogorov-Smirnov test for the $T_*$, $W_*$ and $\psi$ histograms in
table 7.

We find that we cannot reject the null hypothesis that $T_*$ follows
the theoretical expected distribution for almost all the instances at
the 95% level. However, this is not the case for the $W_*$ and $\psi$
histograms, where most instances are unlikely to agree with the theory.
This is expected because of the abnormally long tails that come from
the close two-dimensional projected galaxy pairs in
a dense cluster.
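
The statistic in equation (36) and its p-value can be sketched with the standard library alone. The function names are hypothetical. The p-value uses the asymptotic Kolmogorov series $Q(z) = 2 \sum_{k \ge 1} (-1)^{k-1} e^{-2 k^2 z^2}$, which is an assumption about how the comparison with the Kolmogorov-Smirnov distribution is made; it is only approximate for binned data and for distributions whose parameters were fitted to the same sample.

```python
from math import exp, sqrt


def ks_statistic(p_obs, p_exp, n_clusters):
    """Z = sqrt(n) * sup_i |F_obs,i - F_exp,i| from binned probabilities."""
    f_obs = f_exp = 0.0
    largest = 0.0
    for po, pe in zip(p_obs, p_exp):
        # Running cumulative sums give F_obs,i and F_exp,i.
        f_obs += po
        f_exp += pe
        largest = max(largest, abs(f_obs - f_exp))
    return sqrt(n_clusters) * largest


def kolmogorov_p_value(z, terms=100):
    """Asymptotic probability of a statistic at least as large as z."""
    if z < 0.2:
        # The series converges slowly here and the true value is 1 to
        # within about 1e-12.
        return 1.0
    total = 2.0 * sum((-1) ** (k - 1) * exp(-2.0 * k * k * z * z)
                      for k in range(1, terms + 1))
    return min(1.0, max(0.0, total))
```

A value of `z` near 1.36 gives a p-value close to 0.05, the threshold that corresponds to the 95% level quoted in the text.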

5.2. Comparison between cell sizes 

Comparing the histograms among the a, b and c samples, we see that the
theory agrees with observations for different selection cuts, which
shows that it is a good description of clustering on a wide variety of
scales. These may be as small as a $2.0\,h^{-1}$ Mpc group or as large
as a $20.0\,h^{-1}$ Mpc supercluster. To illustrate this, we plot the
histograms for different selection cuts at the cell size of
$R = 10.0\,h^{-1}$ Mpc and $\Delta(cz) = 500$ km/s with 5 galaxies
in figure 7.

5.3. The Relationship Between rj and vk T 

From tables 4, 5 and 6 we see that the value of $\eta$ increases with
$\nu k_\tau$, which suggests that, as expected, anisotropy in the shape
of a cluster is correlated with anisotropy in the velocity distribution
of a cluster. To illustrate this increase, we plot $\nu k_\tau$ against
$\eta$ in figure 8 and show that this relation also depends on the
number of galaxies in a cell.

When we plot $\nu k_\tau$ against $\eta/\sqrt{N}$, the points from
instances with different $N$ line up with a weak correlation between
$\nu k_\tau$ and $\eta/\sqrt{N}$. These data points indicate that
$\nu k_\tau$ and $\eta/\sqrt{N}$ are close to their isotropic and
uniform values. For an isotropic velocity distribution
$\nu = \sqrt{3} \approx 1.73$, which is somewhat higher than the
observed values of $\nu k_\tau$. This suggests that $k_\tau < 1$ if the
clusters have an isotropic velocity distribution. This is generally
reasonable since we are averaging over a sample of clusters that may
have any orientation, and hence would on average be isotropic.

For a uniform spherical cell, $\eta/\sqrt{N} \approx 1.69$, which is
also close to but slightly less than the observed values, indicating
that the identified clusters are close to, but not quite, uniform. This
suggests that the pairwise radial separation is larger than what is
expected from a uniform cell, which may indicate the presence of
internal structure, or a non-spherical shape for the cluster. A
physical explanation for this difference is likely to be a combination
of these factors and will depend on detailed models of galaxy clusters.

Comparing the probabilities for the $W_*$ histograms for different $N$,
we see that the instances with denser cells tend to have higher
probabilities. This is because the number of pairs in a cell scales as
$N^2$, so that the larger number of pairs in a denser cell will lessen
the relative contribution to $W_*$ from a close visual pair even though
there are more of them. Less dense cells have significantly fewer
pairs, and thus a very close pair will very easily dominate $W_*$ and
result in a more pronounced tail.

6. CONCLUSIONS 

This paper develops a method to find clusters of galax- 
ies in a sky survey and estimates their scaled kinetic and 
correlation potential energies. Using the low redshift 
galaxies from DR7 of the SDSS, we identify a popula- 
tion of galaxy clusters and estimate the scaled energies 
of the cells that host them. 

The distributions of the scaled kinetic energy $T_*$, scaled
correlation potential energy $W_*$ and correlation virial ratio $\psi$
generally agree with the theoretical predictions derived by
Yang & Saslaw (2012) with free parameters that describe the
6-dimensional phase-space structure of a cluster. These parameters,
$\eta$ which describes the shape of a cluster, $\nu$ which describes
the velocity anisotropy and $k_\tau$ which describes the relation
between the dynamical timescale and the crossing time, provide a new
statistical method for estimating the radial distances, transverse
velocities and the average mass of a galaxy.

In addition to these structural parameters, we have also introduced
the parameters $\sigma_{W_*}$, $\sigma_{T_*}$ and $\sigma_\psi$ to
model the statistical uncertainties in the estimated cluster energies.
These parameters are the standard deviations of normal distributions
that we convolve with the theoretical distribution of cluster energies
and represent a combination of observational uncertainties and
dynamical fluctuations. These may include uncertainties related to the
detailed internal structure of a cluster that we cannot observe, and
intrinsic fluctuations in the phase space configuration of clusters.
The physical values of these parameters can eventually be determined
from detailed models of clusters of galaxies, or suitably designed
N-body simulations.

Table 7

Kolmogorov-Smirnov test statistic and histogram probabilities

However, while the anisotropy parameters $\nu k_\tau$ and $\eta$ may be
measured from a snapshot of the 6-dimensional phase space information
of a sample of clusters from a simulation, the intrinsic fluctuations
of a cluster about quasi-equilibrium require more information. A
detailed analysis of such fluctuations is likely to require multiple
snapshots of multiple clusters, at intervals considerably shorter than
a crossing time, in order to track both the quasi-equilibrium average
energies of a cluster and its fluctuations, and to obtain the
distribution of energies about quasi-equilibrium and its contributions
to $\sigma_{T_*}$, $\sigma_{W_*}$ and $\sigma_\psi$. These requirements
generally preclude the use of archived simulations since they are not
archived with a sufficiently high time resolution.

Furthermore, the theory discussed in this paper and paper 1 suggests a
strong connection between the structure of a cluster and the
environment that it exists in. This suggests that the merger history
and dynamics of a cluster are an important factor that determines its
internal structure. Thus relating a high-resolution simulation to
observations is considerably more complicated than a simple
semi-analytic model, and in our case, even more so because we are
interested in the detailed substructure of a cluster of galaxies.

Figure 7. Observed and predicted histograms for $T_*$ for $10.0\,h^{-1}$ Mpc cells with 500 km/s velocity dispersion illustrating the agreement
with theory across different sample cuts. The solid line is the observed histogram and the dashed line is the expected histogram. Top left:
1a(z) sample; Top right: 1b(z) sample; Bottom left: 2b(z) sample; Bottom right: 2c(z) sample.

Quantitatively, the observed distribution of $T_*$ agrees with the
theoretical distribution convolved with a normal distribution, and the
agreement cannot be rejected at the 95% level for most of the instances
we have examined. However, the observed distributions for $W_*$ and
$\psi$ have a long tail that does not agree with theory. This tail is
likely to be caused by the presence of a population of merging
galaxies. These merging galaxies are very close to each other, and
contribute to a very negative $W_*$. These pairs are likely to
eventually become a single galaxy, and should be modeled as a single
extended galaxy (Yang et al. 2011).



In order to account for this long tail, we need to identify 
the merging pairs and consider them as a single extended 
galaxy. We do not do so in this paper because such an 
analysis warrants a much more detailed treatment to deal 
adequately with the merger classification methods and 
would be better addressed in a separate paper. 

We also find that the quasi-equilibrium theory of galaxy clusters
holds for a large range of scales. These range from small groups in
cells of $2.0\,h^{-1}$ Mpc radius, to large supercluster scale
structures in cells of $20.0\,h^{-1}$ Mpc radius. This agrees with the
result that the GQED agrees very well with the counts-in-cells
distribution at a wide variety of scales (Yang & Saslaw 2011;
Sivakoff & Saslaw 2005).



By analyzing different samples of galaxies and clusters, we have also
found that $\nu k_\tau$ is weakly correlated with $\eta/\sqrt{N}$,
which indicates that the velocity anisotropy and position anisotropy of
a cluster are weakly correlated. However, a more general result is that
$\nu k_\tau$ and $\eta/\sqrt{N}$ are close to the isotropic and uniform
values, which shows that, on average, clusters are approximately
isotropic, and are close to, but not quite, uniform collections of
galaxies.

Figure 8. Left plot: Relation between $\nu k_\tau$ and $\eta$; Right plot: Relation between $\nu k_\tau$ and $\eta/\sqrt{N}$.

We conclude that the analysis here suggests that the quasi-equilibrium
theory in paper 1 is a good description of galaxy clustering when the
uncertainties and fluctuations in the cluster kinetic and correlation
potential energies are incorporated. While some of these uncertainties
serve to broaden the distribution, the spatial and velocity anisotropy
parameters $\eta$ and $\nu$ may provide further insights into the
internal structure of galaxy clusters on a statistical basis. These
parameters, and the intrinsic fluctuations around quasi-equilibrium,
may be measured in suitably designed N-body simulations. Using the
conclusions in this paper, we are currently working on a subsequent
paper that will discuss simulations.